Low-Frequency Bandwidth Extension of Telephone Speech Using Sinusoidal Synthesis and Gaussian Mixture Model

Authors

  • Hannu Pulakka
  • Ulpu Remes
  • Santeri Yrttiaho
  • Kalle J. Palomäki
  • Mikko Kurimo
  • Paavo Alku
Abstract

The limited audio bandwidth of narrowband telephone speech degrades the speech quality. This paper proposes a method that extends the bandwidth of telephone speech to the frequency range 0–300 Hz. The lowest harmonics of voiced speech are generated using sinusoidal synthesis. The energy in the extension band is estimated from spectral features using a Gaussian mixture model. The amplitudes and phases of the synthesized signal are adjusted based on the amplitudes and phases of the narrowband input speech. The proposed method was evaluated with listening tests together with a bandwidth extension method for the range 4–8 kHz. The low-frequency bandwidth extension was found to reduce dissimilarity with wideband speech but no perceived quality improvement was achieved.
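The sinusoidal synthesis step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a constant fundamental frequency and fixed, caller-supplied harmonic amplitudes and phases; in the paper these are derived from the narrowband input and a GMM-based energy estimate, which is not reproduced here. The function name `synthesize_low_harmonics`, the 16 kHz sampling rate, and the treatment of 300 Hz as an exclusive upper limit are assumptions made for illustration only.

```python
import numpy as np

def synthesize_low_harmonics(f0_hz, amplitudes, phases, duration_s,
                             fs=16000, cutoff_hz=300.0):
    """Sum cosines at harmonics k*f0 that lie below cutoff_hz.

    Hypothetical helper: amplitudes[k-1] and phases[k-1] belong to the
    k-th harmonic. A constant f0 is assumed over the whole segment.
    """
    n_samples = int(round(duration_s * fs))
    t = np.arange(n_samples) / fs
    out = np.zeros(n_samples)
    for k, (amp, phase) in enumerate(zip(amplitudes, phases), start=1):
        fk = k * f0_hz
        if fk >= cutoff_hz:
            # Harmonics at or above the cutoff are left to the narrowband signal.
            break
        out += amp * np.cos(2.0 * np.pi * fk * t + phase)
    return out

# Example: f0 = 100 Hz, so only the 100 Hz and 200 Hz harmonics are generated.
low_band = synthesize_low_harmonics(100.0, [1.0, 0.5, 0.25], [0.0, 0.0, 0.0], 0.1)
```

In a frame-based system the synthesized segment would be scaled to the estimated extension-band energy and added to the narrowband signal; that step is omitted from this sketch.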


Similar resources

Speech Bandwidth Extension Using Bottleneck Features and Deep Recurrent Neural Networks

This paper presents a novel method for speech bandwidth extension (BWE) using deep structured neural networks. In order to utilize linguistic information during the prediction of high-frequency spectral components, the bottleneck (BN) features derived from a deep neural network (DNN)-based state classifier for narrowband speech are employed as auxiliary input. Furthermore, recurrent neural netw...


Speech Enhancement using Laplacian Mixture Model under Signal Presence Uncertainty

In this paper, an estimator for speech enhancement based on a Laplacian mixture model is proposed. The proposed method estimates the complex DFT coefficients of clean speech from noisy speech using the MMSE estimator, where the clean speech DFT coefficients are modeled as a mixture of Laplacians and the DFT coefficients of the noise are assumed to follow a zero-mean Gaussian distribution. Furthermore, the MMS...


Speech enhancement using STC-based bandwidth extension

Telephone speech is typically bandlimited to 4 kHz, resulting in a ‘muffled’ quality. Coding speech with bandwidth greater than 4 kHz reduces this distortion, but requires a higher bit rate to avoid other types of distortion. An alternative to coding wider bandwidth speech is to exploit correlation between the 0-4 kHz and 4-8 kHz speech bands to resynthesize wideband speech from narrowband spee...


Speech Bandwidth Extension Using Articulatory Features

In this paper, we present a technique for bandwidth extension (BWE) of a narrowband (0–4 kHz) signal using articulatory features. The proposed technique recovers high-band components (4–8 kHz) through Gaussian mixture regression (GMR) on both the acoustic and articulatory features from the X-ray Microbeam (XRMB) speech production database. The Gaussian mixture model (GMM) that is based on acous...


Memory-Based Approximation of the Gaussian Mixture Model Framework for Bandwidth Extension of Narrowband Speech

In this paper, we extend our previous work on exploiting speech temporal properties to improve Bandwidth Extension (BWE) of narrowband speech using Gaussian Mixture Models (GMMs). By quantifying temporal properties through information theoretic measures and using delta features, we have shown that narrowband memory significantly increases certainty about highband parameters. However, as delta f...



Publication date: 2011